Robust energy demodulation based on continuous models with application to speech recognition
نویسندگان
چکیده
In this paper, we develop improved schemes for simultaneous speech interpolation and demodulation based on continuous-time models. This leads to robust algorithms to estimate the instantaneous amplitudes and frequencies of the speech resonances and extract novel acoustic features for ASR. The continous-time models retain the excellent time resolution of the ESAs based on discrete energy operators and perform better in the presence of noise. We also introduce a robust algorithm based on the ESAs for amplitude compensation of the filtered signals. Furthermore, we use robust nonlinear modulation features to enhance the classic cepstrum-based features and use the augmented feature set for ASR applications. ASR experiments show promising evidence that the robust modulation features improve recognition.
منابع مشابه
Continuous-time models for AM-FM signal demodulation and their application to speech recognition
Automatic speech recognition (ASR) systems can benefit from including into their acoustic processing part new features that account for various nonlinear and time-varying phenomena during speech production. In this paper, we develop robust continuoustime expansions used to demodulate the instantaneous amplitudes and frequencies of the speech resonances and extract novel acoustic features from s...
متن کاملContinuous energy demodulation methods and application to speech analysis
Speech resonance signals appear to contain significant amplitude and frequency modulations. An efficient demodulation approach is based on energy operators. In this paper, we develop two new robust methods for energy-based speech demodulation and compare their performance on both test and actual speech signals. The first method uses smoothing splines for discrete-to-continuous signal approximat...
متن کاملSpeaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
متن کاملSpeaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
متن کاملAm-demodulation of Speech Spectra and Its Application to Noise Robust Speech Recognition
In this paper, a novel algorithm that resembles amplitude demodulation in the frequency domain is introduced, and its application to automatic speech recognition (ASR) is studied. Speech production can be regarded as a result of amplitude modulation (AM) with the source (excitation) spectrum being the carrier and the vocal tract transfer function (VTTF) being the modulating signal. From this po...
متن کامل